Real-time, scalable, content-based Twitter users recommendation

Authors

  • Julien Subercaze
  • Christophe Gravier
  • Frédérique Laforest
Abstract

Real-time recommendation of Twitter users based on the content of their profiles is a very challenging task. Traditional IR methods such as TF-IDF fail to handle large datasets efficiently. In this paper we present a scalable approach that allows real-time recommendation of users based on their tweets. Our model builds a graph of terms, driven by the fact that users sharing similar interests will share similar terms. We show how this model can be encoded as a compact binary footprint that allows very fast comparison and ranking, taking full advantage of modern CPU architectures. We validate our approach through an empirical evaluation against Apache Lucene's implementation of TF-IDF. We show that our approach is on average two hundred times faster than a standard optimized implementation of TF-IDF, with a precision of 58%.
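To make the idea of comparing binary footprints concrete, here is a minimal sketch in Python. It is not the authors' encoding: the abstract does not give the footprint width, the hashing scheme, or how the term graph is used, so this sketch ignores the graph and simply sets one bit per term (a Bloom-filter-style bitmask), then ranks candidates by Hamming distance. The names `FOOTPRINT_BITS`, `term_bit`, `footprint`, `hamming_distance`, and `rank` are all hypothetical.

```python
import hashlib

# Assumed width: the abstract does not state the footprint size.
FOOTPRINT_BITS = 64


def term_bit(term: str) -> int:
    """Map a term to a bit position using a stable hash (illustrative choice)."""
    digest = hashlib.blake2b(term.lower().encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big") % FOOTPRINT_BITS


def footprint(terms) -> int:
    """Build a bitmask with one bit set per term (hypothetical encoding)."""
    fp = 0
    for term in terms:
        fp |= 1 << term_bit(term)
    return fp


def hamming_distance(a: int, b: int) -> int:
    """Count differing bits: one XOR followed by a population count."""
    return bin(a ^ b).count("1")


def rank(query_fp: int, candidates: dict) -> list:
    """Return candidate names ordered from closest to farthest footprint."""
    return sorted(candidates, key=lambda name: hamming_distance(query_fp, candidates[name]))
```

The comparison reduces to an XOR and a bit count on a machine word, which is the kind of operation that maps well onto CPU instructions; whether the paper uses Hamming distance or another measure is not stated in the abstract.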

Similar resources

Towards an expressive and scalable Twitter profile hash for users recommendation

Microblogging websites such as Twitter produce tremendous amounts of data each second. Identifying people to follow is a heavy task that cannot be completely done by users. Consequently, real-time recommendation systems require very efficient algorithms to quickly process this massive amount of data, so as to recommend users having similar interests. In this paper we present a tractable algorith...

HashGraph : an expressive and scalable Twitter users profile for recommendation

Microblogging websites such as Twitter produce tremendous amounts of data each second. Identifying people to follow is a heavy task that cannot be completely done by users. Consequently, real-time recommendation systems require very efficient algorithms to quickly process this massive amount of data, so as to recommend users having similar interests. In this paper we present a tractable algorith...

Automatic Hashtag Recommendation in Social Networking and Microblogging Platforms Using a Knowledge-Intensive Content-based Approach

In social networking/microblogging environments, #tags are often used for categorizing messages and marking their key points. Also, since some social networks such as Twitter apply restrictions on the number of characters in messages, #tags can serve as a useful tool for helping users express their messages. In this paper, a new knowledge-intensive content-based #tag recommendation system is intr...

Terms of a Feather: Content-Based News Recommendation and Discovery Using Twitter

User-generated content has dominated the web’s recent growth and today the so-called real-time web provides us with unprecedented access to the real-time opinions, views, and ratings of millions of users. For example, Twitter’s 200m+ users are generating in the region of 1000+ tweets per second. In this work, we propose that this data can be harnessed as a useful source of recommendation knowle...

GraphJet: Real-Time Content Recommendations at Twitter

This paper presents GraphJet, a new graph-based system for generating content recommendations at Twitter. As motivation, we trace the evolution of our formulation and approach to the graph recommendation problem, embodied in successive generations of systems. Two trends can be identified: supplementing batch with real-time processing and a broadening of the scope of recommendations from users t...

Journal title:
  • Web Intelligence

Volume 14  Issue 

Pages  -

Publication date 2016